Span-Based Constituency Parsing with a Structure-Label System and Provably Optimal Dynamic Oracles
Authors
Abstract
Parsing accuracy using efficient greedy transition systems has improved dramatically in recent years thanks to neural networks. Despite striking results in dependency parsing, however, neural models have not surpassed state-of-the-art approaches in constituency parsing. To remedy this, we introduce a new shift-reduce system whose stack contains merely sentence spans, represented by a bare minimum of LSTM features. We also design the first provably optimal dynamic oracle for constituency parsing, which runs in amortized O(1) time, compared to O(n) oracles for standard dependency parsing. Training with this oracle, we achieve the best F1 scores on both English and French of any parser that does not use reranking or external data.
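The abstract does not spell out the transitions, so the following is only a minimal sketch of what a span-only shift-reduce stack could look like. It assumes that each structural action (shift or combine) is paired with an optional label for the span left on top of the stack; the function name `run_actions`, the action names, and the example sentence are hypothetical and are not taken from the paper.

```python
from typing import List, Optional, Set, Tuple

Span = Tuple[int, int]              # half-open interval [i, j) over word positions
Action = Tuple[str, Optional[str]]  # (structural action, label or None)


def run_actions(n_words: int, actions: List[Action]) -> Set[Tuple[int, int, str]]:
    """Replay (structural, label) pairs and return the labeled brackets.

    Assumed transition set (an illustration, not the paper's definition):
      shift   -> push the span covering the next word
      combine -> merge the top two adjacent spans into one
    After each structural action, a non-None label brackets the top span.
    """
    stack: List[Span] = []
    next_word = 0
    brackets: Set[Tuple[int, int, str]] = set()
    for structural, label in actions:
        if structural == "shift":
            if next_word >= n_words:
                raise ValueError("shift with no words left")
            stack.append((next_word, next_word + 1))
            next_word += 1
        elif structural == "combine":
            if len(stack) < 2:
                raise ValueError("combine needs two spans on the stack")
            right = stack.pop()
            left = stack.pop()
            stack.append((left[0], right[1]))
        else:
            raise ValueError(f"unknown structural action: {structural}")
        if label is not None:
            i, j = stack[-1]
            brackets.add((i, j, label))
    if stack != [(0, n_words)]:
        raise ValueError("actions do not reduce the sentence to a single span")
    return brackets


# Hypothetical three-word sentence: "the cat sleeps"
EXAMPLE_ACTIONS: List[Action] = [
    ("shift", None),    # (0, 1)
    ("shift", None),    # (1, 2)
    ("combine", "NP"),  # (0, 2)
    ("shift", "VP"),    # (2, 3)
    ("combine", "S"),   # (0, 3)
]
EXAMPLE_BRACKETS = run_actions(3, EXAMPLE_ACTIONS)
```

In this sketch the stack holds only index pairs, so any features for a stack element would have to be computed from its two boundary positions; a sentence of n words is reduced by n shifts and n - 1 combines.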
Related Resources
A Minimal Span-Based Neural Constituency Parser
In this work, we present a minimal neural model for constituency parsing based on independent scoring of labels and spans. We show that this model is not only compatible with classical dynamic programming techniques, but also admits a novel greedy top-down inference algorithm based on recursive partitioning of the input. We demonstrate empirically that both prediction schemes are competitive wi...
Discontinuous parsing with continuous trees
We introduce a new method for incremental shift-reduce parsing of discontinuous constituency trees, based on the fact that discontinuous trees can be transformed into continuous trees by changing the order of the terminal nodes. It allows for a clean formulation of different oracles, leads to faster parsers and provides better results. Our best system achieves an F1 of 80.02 on TIGER.
A Dynamic Oracle for Arc-Eager Dependency Parsing
The standard training regime for transition-based dependency parsers makes use of an oracle, which predicts an optimal transition sequence for a sentence and its gold tree. We present an improved oracle for the arc-eager transition system, which provides a set of optimal transitions for every valid parser configuration, including configurations from which the gold tree is not reachable. In such...
Joint Syntacto-Discourse Parsing and the Syntacto-Discourse Treebank
Discourse parsing has long been treated as a stand-alone problem independent from constituency or dependency parsing. Most attempts at this problem are pipelined rather than end-to-end, sophisticated, and not self-contained: they assume gold-standard text segmentations (Elementary Discourse Units), and use external parsers for syntactic features. In this paper we propose the first end-to-end dis...
Multilingual Lexicalized Constituency Parsing with Word-Level Auxiliary Tasks
We introduce a constituency parser based on a bi-LSTM encoder adapted from recent work (Cross and Huang, 2016b; Kiperwasser and Goldberg, 2016), which can incorporate a lower level character biLSTM (Ballesteros et al., 2015; Plank et al., 2016). We model two important interfaces of constituency parsing with auxiliary tasks supervised at the word level: (i) part-of-speech (POS) and morphological...
Journal title:
Volume, issue
Pages -
Publication date: 2016